Arabic anaphora resolution: corpora annotation with coreferential links
Abstract
Annotated resources are much needed for training and evaluating anaphora resolution systems. Coreferential chain annotation is a difficult task that cannot be carried out without an appropriate tool. In this paper, we present our work on annotating Arabic corpora with anaphoric links (i.e., annotating the identity relation between anaphors and their antecedents). In particular, we propose an anaphoric annotation tool for Arabic. The tool automatically detects Arabic pronouns and allows the human annotator to select several anaphoric pronouns related to the same antecedent. Our aim is to build a real corpus that can be used for anaphora resolution, either for system training or for evaluation.
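To make the annotation model in the abstract concrete, here is a minimal Python sketch of the two capabilities it mentions: automatic detection of pronouns and linking several anaphoric pronouns to one antecedent. It is an illustration under stated assumptions, not the authors' tool. The names `INDEPENDENT_PRONOUNS`, `AnaphoricLink`, `detect_pronouns` and `link_anaphors` are hypothetical, the pronoun list is deliberately tiny, and the input is assumed to be already tokenized on whitespace. Clitic (attached) pronouns, which a real Arabic system would need morphological analysis to find, are not handled.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical, deliberately small list of independent (non-clitic) Arabic pronouns.
# A real tool would need a fuller inventory and morphological analysis for clitics.
INDEPENDENT_PRONOUNS = {"هو", "هي", "هم", "هما", "هن"}


@dataclass
class AnaphoricLink:
    """One antecedent and every anaphoric pronoun the annotator tied to it."""
    antecedent: Tuple[int, int]                        # token span [start, end)
    anaphors: List[int] = field(default_factory=list)  # token indices of pronouns


def detect_pronouns(tokens: List[str]) -> List[int]:
    """Return the indices of tokens that match the pronoun list exactly."""
    return [i for i, tok in enumerate(tokens) if tok in INDEPENDENT_PRONOUNS]


def link_anaphors(antecedent: Tuple[int, int],
                  selected: List[int],
                  candidates: List[int]) -> AnaphoricLink:
    """Attach several selected pronouns to the same antecedent.

    Raises ValueError if a selected index was not detected as a pronoun.
    """
    unknown = [i for i in selected if i not in candidates]
    if unknown:
        raise ValueError(f"not detected as pronouns: {unknown}")
    return AnaphoricLink(antecedent=antecedent, anaphors=sorted(selected))


if __name__ == "__main__":
    # Pre-tokenized toy sentence: "The boy is clever, he reads and he writes."
    tokens = "الولد ذكي ، هو يقرأ و هو يكتب".split()
    candidates = detect_pronouns(tokens)
    # The annotator marks token 0 as the antecedent and selects both pronouns.
    link = link_anaphors((0, 1), candidates, candidates)
    print(link)
```

Storing one link per antecedent with a list of anaphors mirrors the abstract's point that the annotator can select several pronouns for the same antecedent in one step, rather than creating one pairwise link at a time.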
Similar resources
Multilingual corpora with coreferential annotation of person entities
This paper presents three corpora with coreferential annotation of person entities for Portuguese, Galician and Spanish. They contain coreference links between several types of pronouns (including elliptical, possessive, indefinite, demonstrative, relative and personal clitic and non-clitic pronouns) and nominal phrases (including proper nouns). Some statistics have been computed, showing distr...
CLinkA A Coreferential Links Annotator
The annotation of coreferential chains in a text is a difficult task, which requires a lot of concentration. Given its complexity, without an appropriate tool it is very difficult to produce high quality coreferentially annotated corpora. In this paper we discuss the requirements for developing a tool for helping the human annotator in this task. The annotation scheme used by our program is deri...
Where Anaphora and Coreference Meet. Annotation in the Spanish CESS-ECE Corpus
This paper describes the guidelines of the annotation scheme designed to enrich the Spanish CESS-ECE corpus with coreference information, which is a significant step towards the definition of an exhaustive typology of pronominal and full NP coreferential expressions and their relations for Spanish. The goal is twofold. From a computational perspective, this work establishes the formal foundatio...
Predicative NPs and the annotation of reference chains
In the development of machine learning systems for identification of reference chains, hand-annotated corpora play a crucial role. This paper concerns the question of how predicative NPs should be annotated w.r.t. coreference in corpora for such systems. This question highlights the tension that sometimes appears in the development of corpora between linguistic considerations and the aim for pe...
Learning Dutch Coreference Resolution
This paper presents a machine learning approach to the resolution of coreferential relations between nominal constituents in Dutch. It is the first significant automatic approach to the resolution of coreferential relations between nominal constituents for this language. The corpus-based strategy was enabled by the annotation of a substantial corpus (ca. 12,500 noun phrases) of Dutch news magaz...
Journal title:
- Int. Arab J. Inf. Technol.
Volume 6, Issue
Pages -
Publication date: 2009